1 INTRODUCTION


This note summarizes a two-day workshop held in La Jolla, California
in June of 1988. The workshop was cosponsored by the Center for Nonlinear
Studies at the Los Alamos National Laboratory, and by the JASON group,
The MITRE Corporation, as part of the 1988 JASON Summer Study.
The purpose of the workshop was to identify, define, and begin to resolve
substantive issues that must be addressed before a special-purpose cellular
automata computer can be implemented in hardware.

The  workshop  attendees  were: 

•  George  Adams,  Purdue  University 

•  Gary  Doolen,  Los  Alamos  National  Laboratory 

•  Paul  Frederickson,  NASA  Ames,  RIAC  project 

•  Castor  Fu,  Stanford  University 

•  Brosl  Hasslacher,  Los  Alamos  National  Laboratory 

•  Fung  F.  Lee,  Stanford  University 

•  Norman  Margolus,  MIT  Laboratory  for  Computer  Science 

•  Tsutomu  Shimomura,  Los  Alamos  National  Laboratory 

•  Tom  Toffoli,  MIT  Laboratory  for  Computer  Science 

and  the  following  members  of  the  JASON  group: 

•  Kenneth  Case,  University  of  California  at  San  Diego 

•  Alvin  Despain,  University  of  California  at  Berkeley 

•  Freeman  Dyson,  Institute  for  Advanced  Study 

•  Michael  Freedman,  University  of  California  at  San  Diego 

•  Claire  Max,  Lawrence  Livermore  National  Laboratory 
•  Oscar  Rothaus,  Cornell  University.

Henry Abarbanel, a JASON member from the University of California at San
Diego, was not able to attend the two-day workshop, but did participate in
planning the workshop and in discussion of the issues.

The primary emphasis of the workshop was on the use of cellular automata
for simulations of three-dimensional incompressible Navier-Stokes
hydrodynamics. Within this context, there are two types of applications
for which a special-purpose computer might offer important advantages
over conventional numerical hydrodynamics techniques implemented
on general-purpose supercomputers:


1. Studies of flows with complex boundary conditions. For example, one
might look at a boundary layer and study various techniques that have
been suggested for drag reduction and boundary-layer modification.

2. Studies of three-dimensional incompressible flows at high Reynolds
numbers. These could include studies of the onset of fluid turbulence,
free-boundary problems (such as ship wakes and drag), or the combination
of hydrodynamics and simple chemical reaction systems.


The  issues  discussed  at  the  workshop  fall  into  three  general  categories:  theory, 
computer  simulation,  and  hardware. 
2 THEORETICAL ISSUES


The most prevalent use of cellular automata for modeling hydrodynamics
has been the so-called lattice gas. In this approach, one follows the motions
of many individual particles which interact via given collision laws at fixed
lattice sites or nodes. The individual particles are allowed to have at most a
few discrete speeds relative to the grid of lattice nodes. The hydrodynamic
limit is regained by averaging over a large number of these discrete particles
to obtain the first few moments of their distribution function, namely the
fluid velocity, density, etc. In two spatial dimensions, the properties of
possible sets of collision rules for the particles and lattice geometries for
the nodes are now reasonably well understood. There are two practical ways to
represent a given rule set: via a look-up table which enumerates all the
possible incoming and outgoing configurations, or via an algorithm or
computation which generates the rules anew at each timestep and each
collision site.
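The look-up table representation can be illustrated with a minimal sketch. It assumes the four-direction HPP model on a square lattice, which this note does not name and which is used here only because its rule table has just 16 entries; the bit ordering, function names, and periodic boundaries are likewise assumptions made for the example. The only nontrivial rule is that a head-on pair leaves at right angles.

```python
import numpy as np

# Assumed encoding: bit k set means a particle moving in direction k is present.
# Directions: 0 = +x, 1 = +y, 2 = -x, 3 = -y (axis 0 is x, axis 1 is y).
VELOCITIES = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def build_collision_table():
    """Look-up table mapping each incoming 4-bit state to its outgoing state."""
    table = np.arange(16, dtype=np.uint8)  # default: particles pass through
    table[0b0101] = 0b1010                 # head-on pair along x leaves along y
    table[0b1010] = 0b0101                 # head-on pair along y leaves along x
    return table

def step(state, table):
    """One timestep: collide at every site, then move each particle one link."""
    collided = table[state]
    out = np.zeros_like(state)
    for k, (dx, dy) in enumerate(VELOCITIES):
        occupied = (collided >> k) & 1
        moved = np.roll(occupied, shift=(dx, dy), axis=(0, 1))  # periodic box
        out |= (moved << k).astype(np.uint8)
    return out

def moments(state):
    """Total particle number and total momentum (the lowest moments)."""
    bits = np.stack([(state >> k) & 1 for k in range(4)], axis=-1).astype(np.int64)
    mass = int(bits.sum())
    momentum = bits.reshape(-1, 4).sum(axis=0) @ np.array(VELOCITIES)
    return mass, momentum

rng = np.random.default_rng(0)
state = rng.integers(0, 16, size=(32, 32), dtype=np.uint8)
table = build_collision_table()
mass_before, momentum_before = moments(state)
for _ in range(20):
    state = step(state, table)
mass_after, momentum_after = moments(state)
```

Because both the collision table and the streaming step conserve particle number and momentum, the totals before and after the loop agree exactly. Fluid density and velocity would be obtained by taking the same moments over local blocks of sites rather than over the whole lattice.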

However, in three spatial dimensions the possible rule sets are far more
complicated, and there are many unsolved questions regarding appropriate
collision rules and their efficient execution. For maximum efficiency,
a special-purpose lattice-gas computer should probably contain a hard-wired
implementation of a particular rule set. However, the general consensus at
the workshop was that the rule sets proposed so far for three spatial
dimensions are not yet understood well enough to settle upon an optimum
one for hardware implementation.

Important  issues  that  remain  to  be  solved  concerning  collision  rules  for 
three-dimensional  hydrodynamics  are  the  following: 


1. What is the “best” rule set to use for modeling three-dimensional
hydrodynamics?

(a) How does the choice of this “best” rule set change with the type
of application one wants to solve? For example, are some rule
sets better for studies of boundary-layer effects or free-boundary
problems, while others are optimum for studying the onset of
turbulence at high Reynolds number?
(b) How can rules be “tuned” to get optimum results for given problem
parameters? For example, how can one optimize for high Reynolds
number, or for specific types of boundary conditions?

2. Rules for lattice gases representing three-dimensional hydrodynamics
tend to be very complicated. One way to implement them computationally
is with a look-up table, but such tables become very large. If there
are n bits at each lattice site, then there are $2^n$ table entries. For
example, the 24-bit model requires about 16 million entries ($2^{24}$).
How can this large number of rules be reduced by “factoring” or “grouping”
them, to reduce the size of the rule representation in the look-up table?
What is the fundamental dimension of the rule set?

3. In several proposed rule sets, one has to choose whether the same
collision will always have the same outcome, or whether one will implement
a randomization process within the rule set to “mix up” the collision
outcomes. The addition of an explicit randomization procedure is
expensive computationally. Under what circumstances can one rely on
the inherently high frequency of particle collisions to achieve
randomization, so that it does not have to be explicitly included in the
rule engine?

4. A related question concerns the desirability of adding a “collision bit”
to the algorithm. This is an additional bit determining whether a particle
will or will not undergo a collision at the next lattice node that it
reaches, if all the other conditions for a collision at that node are
satisfied. If all particles undergo collisions whenever they can (no
collision bit), one obtains a more “collisional” rule set, leading to the
potential for attaining higher Reynolds numbers. Are there circumstances in
which a less collisional rule set would be desirable?

5.  What  advantages  are  there  to  using  non-periodic  tiling  or  quasilattices 
for  modeling  three-dimensional  hydrodynamics,  as  compared  with  the 
so-called  four-dimensional  schemes  or  other  periodic  tiling  schemes? 

6. Is there a lattice-gas analog for adaptive-mesh hydrodynamic
techniques, so that greater spatial resolution can be achieved in regions
where it is needed? Can sub-grid scaling rules be derived to extend the
spatial resolution of the lattice-gas method?
7. What physical laws or partial differential equations do the various
rule sets represent? Can the differences between the Navier-Stokes
equations and the lattice-gas implementations be systematically
understood?

(a) Under what conditions (limits on the Mach number, particle density,
Reynolds number) does the lattice-gas model with a given rule
set reduce to three-dimensional Navier-Stokes hydrodynamics?

(b)  Given  a  set  of  physical  constraints,  can  an  algorithm  be  developed 
that  will  systematically  generate  a  corresponding  lattice  gas  rule 
set? 

(c) Each given rule set implies a particular functional form for the
viscosity as a function of density. Given that the density is nearly
constant in space for incompressible flows with Mach numbers
small compared to unity, does it matter whether or not the density
dependence of the viscosity law is physical?

(d) It has been suggested that the nonphysical function $g(\rho)$
appearing in front of the $\mathbf{u} \cdot \nabla \mathbf{u}$ term in the
momentum equation can be eliminated by using rules which include two or
more discrete (nonzero) velocities. Is this generally valid? Under what
conditions would it be desirable to use more than one particle velocity?
What is the gain in accessible Reynolds number when additional
speeds are allowed? Are there advantages to these schemes that
would allow lattice gases to satisfy statistics other than Fermi
statistics? (The latter prevail for most currently used rules.)

In addition to the above questions concerning rules for lattice-gas
representations of hydrodynamics, there is a set of issues involving
extensions of the cellular automata methodology to other physical models:

(a) Can hydrodynamics be modeled by using cellular automata particles
to represent vorticity, in analogy with finite-difference
vorticity-tracking algorithms? What range of Reynolds numbers could such
a technique model?

(b) How practical would it be to add some simple extensions to
three-dimensional lattice-gas hydrodynamics? Some extensions that
would be useful include gravity or other body forces, two or more
different fluid types, or simple chemistry. It was generally agreed
at the workshop that extensions which involve action at a distance,
such as Maxwell’s equations, would require a very different
algorithmic approach.
3 ISSUES CONCERNING NUMERICAL SIMULATIONS


The workshop participants felt that there was a need to develop “benchmark”
simulation problems. These would consist of a few canonical two- and
three-dimensional hydrodynamics problems, for which the numerical results
of various lattice-gas models could be compared with each other, with
conventional hydrodynamics simulations, and with experimental results. This
is particularly important because different lattice-gas rule
sets may represent different approximations to the Navier-Stokes equations
(i.e., they may approach the Navier-Stokes equations in different asymptotic
limits).

A parallel effort should be made to compare lattice-gas simulation results
with standard analytic solutions to the Navier-Stokes equations, in cases
where these are known. Possible examples are channel flow, pipe flow,
Poiseuille flow, Couette flow, and so forth. This has been done to a limited
extent for two-dimensional lattice-gas models, but three-dimensional
applications have not yet been well studied.
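As an illustration of such a comparison, the sketch below evaluates the textbook plane Poiseuille profile between two no-slip walls, u(y) = (G / 2ν) y (H − y), and measures how far a sampled profile departs from it. The symbols G (driving force per unit mass), H (channel height), and ν (kinematic viscosity), as well as the function names, are assumptions introduced for the example. In practice the measured profile would come from averaging lattice-gas particle velocities across the channel.

```python
import numpy as np

def poiseuille_profile(y, height, force, viscosity):
    """Analytic plane Poiseuille velocity between no-slip walls at y=0 and y=height.

    force is the driving force (or pressure gradient) per unit mass and
    viscosity is the kinematic viscosity; both are assumed inputs.
    """
    return force / (2.0 * viscosity) * y * (height - y)

def relative_l2_error(measured, reference):
    """Relative L2 difference between a measured and a reference profile."""
    return np.linalg.norm(measured - reference) / np.linalg.norm(reference)

# Placeholder data: the analytic profile plus small noise stands in for an
# averaged lattice-gas measurement.
y = np.linspace(0.0, 1.0, 33)
reference = poiseuille_profile(y, height=1.0, force=8.0, viscosity=1.0)
rng = np.random.default_rng(1)
measured = reference + 0.01 * rng.standard_normal(y.size)
error = relative_l2_error(measured, reference)
```

A small relative error indicates that the averaged simulation reproduces the parabolic profile. Because the profile depends on the viscosity, the same comparison could also be used to fit the effective viscosity of a given rule set.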

A different type of test of lattice-gas algorithms was thought to be
important as well. One should perform the standard numerical test of
increasing the grid resolution, while holding fixed all of the “physical”
parameters describing the problem. The goal would be to check that the
higher-resolution result is identical to that obtained with lower numerical
resolution.
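One way to carry out this check is to block-average the higher-resolution field down to the coarser grid and compare the two. The sketch below assumes two-dimensional fields whose sizes differ by an integer factor; the function names are hypothetical. For a lattice gas the two results would be expected to agree only to within statistical noise, so a tolerance is needed in practice.

```python
import numpy as np

def block_average(field, factor):
    """Average a 2-D field over non-overlapping factor x factor blocks."""
    nx, ny = field.shape
    if nx % factor or ny % factor:
        raise ValueError("grid size must be divisible by factor")
    blocks = field.reshape(nx // factor, factor, ny // factor, factor)
    return blocks.mean(axis=(1, 3))

def resolution_discrepancy(coarse, fine):
    """Relative L2 difference between a coarse-grid result and a fine-grid
    result averaged down to the same grid."""
    factor = fine.shape[0] // coarse.shape[0]
    fine_on_coarse = block_average(fine, factor)
    return np.linalg.norm(fine_on_coarse - coarse) / np.linalg.norm(coarse)

# Synthetic example: a fine field that is an exact refinement of the coarse one.
coarse = np.arange(1.0, 17.0).reshape(4, 4)
fine = np.kron(coarse, np.ones((2, 2)))
discrepancy = resolution_discrepancy(coarse, fine)
```

In this synthetic case the discrepancy is zero by construction. With real simulation output, one would watch whether the discrepancy shrinks as the resolution is increased.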

A final numerical simulation issue thought to be important by the workshop
participants concerns how to generate adequate graphical visualizations
of the results of a three-dimensional lattice-gas simulation. It was pointed
out that the amount of data storage needed for a three-dimensional
simulation at high Reynolds number will be very high. Therefore, thought
must be given to how to integrate input-output and graphical display within
the process of the numerical computation itself. For the types of physical
problems which one wants to address using lattice gases, it may not be
adequate to obtain graphical displays of the results based entirely upon
post-processing.
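A rough estimate shows the scale of the storage problem. The sketch below combines two figures quoted elsewhere in this note, about 3 × 10^10 lattice points and 24 bits per site, and compares a raw snapshot with a coarse-grained one. The coarse-graining block size, the number of stored fields, and the 4-byte values are assumptions made only for illustration.

```python
def snapshot_bytes(sites, bits_per_site):
    """Bytes needed to store the raw state of every lattice site once."""
    return sites * bits_per_site // 8

def coarse_grained_bytes(sites, block_edge, fields=4, bytes_per_value=4):
    """Bytes needed if each block_edge^3 block of sites is reduced to a few
    averaged fields (assumed here: density and three velocity components)."""
    return (sites // block_edge**3) * fields * bytes_per_value

raw = snapshot_bytes(3 * 10**10, 24)               # 90,000,000,000 bytes
reduced = coarse_grained_bytes(3 * 10**10, 8)      # 937,500,000 bytes
```

Under these assumptions a single raw snapshot is about 90 gigabytes, while averaging over 8 × 8 × 8 blocks during the computation reduces it by roughly a factor of 100. This is one reason to integrate the reduction with the computation rather than rely on post-processing.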
4 ISSUES CONCERNING THE HARDWARE
PERFORMANCE OF A SPECIAL-PURPOSE
LATTICE-GAS COMPUTER


In order to focus on the issue of hardware design for a lattice-gas machine,
a set of performance measures was chosen. The idea was to outline
hypothetical specifications for hardware components, so that when different
candidate architectures were compared with each other, they would all be
making the same assumptions about the capabilities of commonly used hardware
components. The following table gives a rough overview of the capabilities
of VLSI technology today and in five years.


Table 1: VLSI Technology (CMOS)

Today                              In Five Years

1 cm$^2$ active area               1 cm$^2$
200 pins                           400 pins
1 Mbit DRAM                        4 Mbit
50K “random” transistors           200K
10 nsec internal clock             1 nsec
  (on-chip communications)
80 nsec external drive             8 nsec
  (off-chip communications)

Using  these  characteristics,  which  are  of  course  only  approximate,  one 
can  outline  the  characteristics  and  performance  of  various  architectures  for 
a  lattice  gas  supercomputer. 

There appears to be a practical limit on the total number of chips that
can plausibly be included in a supercomputer. Today’s Crays have about a
third of a million chips. Workshop participants hypothesized that in the
future one might build supercomputers with up to a million chips. Since the
total number of lattice points required for a lattice-gas computation of high
Reynolds number three-dimensional hydrodynamics is much larger than a
million, one is led to a design in which many cellular automata lattice points
are placed on each chip.

The  next  hardware  issue  is  how  to  implement  the  set  of  collision  rules. 
Since  only  a  few  rule  sets  for  three  spatial  dimensions  have  been  studied 
to  date,  the  workshop  participants  felt  that  it  was  premature  to  choose  a 
specific  rule  set  for  implementation  in  hardware.  It  was  suggested  that  even 
after  more  three-dimensional  rules  have  been  studied,  it  would  be  desirable 
to  leave  flexibility  in  the  choice  of  rules  for  the  lattice-gas  supercomputer. 
There  are  two  reasons  for  this  choice.  First,  a  new  and  better  set  of  collision 
rules  for  hydrodynamics  might  be  invented  at  any  time;  and  second,  one 
may  at  some  later  point  want  to  use  the  lattice-gas  computer  to  study  other 
physical  models  such  as  the  mixing  of  two  different  gases,  or  hydrodynamics 
with  simple  chemistry. 

Flexibility in the choice of rule set would have the most straightforward
implementation if collision rules were executed via look-up tables. In that
case one could feed an alternative look-up table into the computer when one
wanted to change rules. The difficulty with this approach is that the rules
suggested to date for hydrodynamics in three spatial dimensions would
require very large look-up tables, which would use up large amounts of
memory and would be slow in computing collisions.

Thus there is a lot to be gained by understanding the symmetries
underlying each proposed rule set, so that the look-up table can be collapsed
into a considerably smaller amount of memory space.
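The following sketch illustrates both the size of a full table and the kind of saving that symmetry can offer. It assumes a toy two-dimensional model with six bits per site, ordered around a hexagon so that a 60-degree lattice rotation is a cyclic bit shift, and it assumes the rule set commutes with those rotations. Neither assumption comes from this note, and the symmetry groups of the three-dimensional models are larger and are not treated here.

```python
def table_entries(bits_per_site):
    """Number of entries in a full look-up table with one entry per site state."""
    return 2 ** bits_per_site

def rotate(state, n_bits):
    """Cyclically shift an n_bits-bit state by one position (one lattice rotation)."""
    mask = (1 << n_bits) - 1
    return ((state << 1) | (state >> (n_bits - 1))) & mask

def canonical(state, n_bits):
    """Smallest state reachable by rotation: one representative per symmetry class."""
    best = state
    for _ in range(n_bits - 1):
        state = rotate(state, n_bits)
        best = min(best, state)
    return best

def reduced_table_size(n_bits):
    """Number of distinct symmetry classes, i.e. entries a reduced table needs."""
    return len({canonical(s, n_bits) for s in range(2 ** n_bits)})

full_24_bit = table_entries(24)        # 16,777,216 entries
full_6_bit = table_entries(6)          # 64 entries
reduced_6_bit = reduced_table_size(6)  # 14 rotation classes
```

In this toy case rotation symmetry alone reduces 64 entries to 14. The price is extra logic: the incoming state must be rotated to its canonical form before the look-up and the outcome rotated back afterwards, so the saving in memory is traded against added computation per collision.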

The second method of implementing a rule set is to design a computation
engine in hardware that would recalculate the rules “on the fly” for each
collision. Relative to a look-up table approach, this technique is preferable
from the point of view of speed and feasibility. However, the
hardware rule-engine is less flexible than a table look-up approach, unless
a software layer can be added to customize the rule engine for a choice of
several different rule sets.

Since many lattice points reside on each chip, and since off-chip
communications are slower than those that remain on-chip, it seems desirable
to locate on each chip the look-up tables or hardware rule-engines which
calculate the collision outcomes. This avoids the time delays which would
occur if one had to go off-chip to calculate collision outcomes. If there are
many lattice points on a chip, one may want to have many “computational nodes”
on each chip. (Here a “computational node” is defined to be a look-up table
or a rule-engine for calculating collision outcomes.) This would avoid the
time delays inherent in updating all of the lattice points on a given chip
sequentially.

Thus one must choose how to trade off the number of lattice points which
can be stored on a chip against the number of “computational nodes” that will
fit on a chip. The results of this trade-off will probably vary with the
specific type of rule set chosen, since the size and complexity of the
“computational node” and the number of bits required for a lattice point will
in general vary. In the example shown in the following table, it was decided
to allocate half of the chip space to lattice points and half to
“computational nodes.”

With the above discussion as background, the workshop arrived at the
following target performance characteristics of a hypothetical lattice-gas
computer:


Table 2: Target Performance Parameters

Problem definition:

Three-dimensional incompressible hydrodynamics
Flexible boundary conditions
Some flexibility in the rule set
If possible, high input-output rate

Hardware aspects:

512 lattice points per “computational node”
64 “computational nodes” per chip
32,000 lattice points per chip
About a million chips total
$3 \times 10^{10}$ lattice points total ($> 10^{11}$ within 5 years)
10 nsec update rate (on-chip)
About $64 \times 10^{8}$ site updates per chip per second
About $6 \times 10^{15}$ site updates/sec total
($> 6 \times 10^{16}$ site updates/sec within 5 years)
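The entries in Table 2 follow from one another by simple arithmetic, which the sketch below reproduces. It assumes that each “computational node” updates one lattice site per clock period, which is the reading that yields the table's per-chip rate; the function and key names are hypothetical.

```python
def lattice_gas_targets(points_per_node=512, nodes_per_chip=64,
                        chips=10**6, update_time_ns=10):
    """Derive the aggregate Table 2 figures from the per-node assumptions."""
    # Assumption: one site update per node per clock period.
    updates_per_node_per_s = 10**9 // update_time_ns
    points_per_chip = points_per_node * nodes_per_chip
    updates_per_chip_per_s = nodes_per_chip * updates_per_node_per_s
    return {
        "points_per_chip": points_per_chip,
        "total_points": points_per_chip * chips,
        "updates_per_chip_per_s": updates_per_chip_per_s,
        "total_updates_per_s": updates_per_chip_per_s * chips,
        # Time for one node to sweep through all of its lattice points once.
        "sweep_time_ns": points_per_node * update_time_ns,
    }

today = lattice_gas_targets()
five_years = lattice_gas_targets(update_time_ns=1)
```

With the default values this gives 32,768 lattice points per chip, about 3.3 × 10^10 points in total, 6.4 × 10^9 site updates per chip per second, and 6.4 × 10^15 site updates per second overall, matching the rounded entries in the table. Reducing the clock period to 1 nsec raises the total to 6.4 × 10^16.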
5 CONCLUSIONS


The workshop in La Jolla produced a considerable amount of enthusiasm
about the potential of a dedicated special-purpose lattice-gas computer.
Preliminary estimates based on the above performance numbers suggest that
such a machine could surpass the present performance of a general-purpose
supercomputer such as the Cray II ($3 \times 10^{7}$ site updates/sec) by a
factor of about $10^{8}$, and possibly considerably more. Of course, the target
machine could be expensive; first-of-a-kind supercomputers can cost from tens
of millions to a hundred million dollars. The cost of this machine is
proportional to the number of chips, so a reduction in chip count by a factor
of 10 would result in a factor of 10 reduction in cost.
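As a rough check on the quoted factor, dividing the Table 2 target rate by the Cray figure given above yields:

```latex
\[
\frac{6 \times 10^{15}\ \text{site updates/sec}}{3 \times 10^{7}\ \text{site updates/sec}}
  = 2 \times 10^{8}
\]
```

With the five-year target of $6 \times 10^{16}$ site updates/sec, the same ratio becomes $2 \times 10^{9}$, which is consistent with the remark that the advantage could be considerably larger.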

In view of the combination of large cost and high scientific potential for
such a machine, it will be imperative to proceed along two parallel paths: 1)
refinement of the theoretical understanding of cellular automata rules and
lattices in three spatial dimensions, and 2) building of intermediate-scale
hardware implementations of dedicated cellular automata computational
engines, so as to gain expertise in the practical areas of architecture
tradeoffs and implementations. A very good example of such an intermediate
engine is the CAM-8 machine recently proposed by Margolus and Toffoli, which
would deliver $2 \times 10^{10}$ site updates per second with 16 bits per site.
Progress along both of these paths will be necessary in order to learn how
best to exploit the potential of a dedicated lattice-gas supercomputer.